Efficient large-scale sequence comparison by locality-sensitive hashing

نویسندگان

چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient large-scale sequence comparison by locality-sensitive hashing

MOTIVATION Comparison of multimegabase genomic DNA sequences is a popular technique for finding and annotating conserved genome features. Performing such comparisons entails finding many short local alignments between sequences up to tens of megabases in length. To process such long sequences efficiently, existing algorithms find alignments by expanding around short runs of matching bases with ...

متن کامل

Large-Scale Distributed Locality-Sensitive Hashing for General Metric Data

Locality-Sensitive Hashing (LSH) is extremely competitive for similarity search, but works under the assumption of uniform access cost to the data, and for just a handful of dissimilarities for which locality-sensitive families are available. In this work we propose Parallel Voronoi LSH, an approach that addresses those two limitations of LSH: it makes LSH efficient for distributedmemory archit...

متن کامل

Large Scale Machine Learning Jan 18 , 2016 Lecture 5 : Large - Scale Search : Locality Sensitive Hashing ( LSH )

Nowadays, there exist hundreds of millions of images online. These images are either stored in web pages, or databases of companies, such as Facebook, Flickr, etc. It is challenging to quickly find similar images from these huge repositories. This is because: • The repositories are huge. Facebook has around 10 billion images [2]. These images have different resolution, dimension. • Images are v...

متن کامل

Beyond Locality-Sensitive Hashing

We present a new data structure for the c-approximate near neighbor problem (ANN) in the Euclidean space. For n points in R, our algorithm achieves Oc(n + d logn) query time and Oc(n + d logn) space, where ρ ≤ 7/(8c2) + O(1/c3) + oc(1). This is the first improvement over the result by Andoni and Indyk (FOCS 2006) and the first data structure that bypasses a locality-sensitive hashing lower boun...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Bioinformatics

سال: 2001

ISSN: 1367-4803,1460-2059

DOI: 10.1093/bioinformatics/17.5.419